AMENDMENTS TO THE CLAIMS

This listing of claims will replace all prior versions and listings of claims in this 
application: 

Listing of Claims: 

1. (Currently amended) A processor-implemented method of collaborative
focused crawling of documents related to multiple focus topics on a network,
the method comprising:

selectively prioritizing the documents to crawl based on a set of rules; 
fetching prioritized documents from the network; 
for each fetched document, determining whether the fetched 
document is relevant to any of the multiple focus topics; 

crawling the fetched document that matches any of the multiple focus
topics; [[and]]

further crawling out-links on the fetched document based on an 
assumption that if the fetched document is of interest, the out-links are also of 
interest;

determining whether the fetched document should be disallowed, and 
upon determination that the fetched document should be disallowed: 

selectively disallowing the fetched document; 

identifying a resource locator string associated with the fetched 
document; and 

placing the resource locator string for the fetched document in a 
blacklist in order to prevent future crawling of the fetched document.
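
For illustration only, the steps recited in claim 1 can be sketched as a small loop. This is a minimal sketch under stated assumptions, not the claimed implementation: the in-memory `network` dictionary, the keyword-overlap relevance test, the `unfocus_topics` set used to decide disallowance, and the `priority` callback standing in for the "set of rules" are all hypothetical. The sketch also checks disallowance before relevance, which is an ordering choice of the sketch.

```python
import heapq
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-in for the network: URL -> (document text, out-links).
Page = Tuple[str, List[str]]


def crawl(
    network: Dict[str, Page],
    seeds: List[str],
    focus_topics: Set[str],
    unfocus_topics: Set[str],
    priority: Callable[[str], int],
) -> Tuple[List[str], Set[str]]:
    """Return (relevant URLs in crawl order, blacklist of disallowed URLs)."""
    blacklist: Set[str] = set()
    seen: Set[str] = set(seeds)
    relevant: List[str] = []

    # Prioritize documents to crawl: lower priority value is fetched first.
    frontier = [(priority(url), url) for url in seeds]
    heapq.heapify(frontier)

    while frontier:
        _, url = heapq.heappop(frontier)
        if url in blacklist or url not in network:
            continue

        # "Fetch" the prioritized document.
        text, out_links = network[url]
        words = set(text.lower().split())

        # Assumed disallow test: any overlap with an unfocus topic.
        if words & unfocus_topics:
            blacklist.add(url)  # prevents future crawling of this URL
            continue

        # Assumed relevance test: any overlap with any focus topic.
        if words & focus_topics:
            relevant.append(url)
            # A relevant document's out-links are assumed to be of interest.
            for link in out_links:
                if link not in seen and link not in blacklist:
                    seen.add(link)
                    heapq.heappush(frontier, (priority(link), link))

    return relevant, blacklist
```

Called with a toy network, a single seed, and `priority=len`, the function returns the matching documents in crawl order together with the blacklist; out-links of a disallowed or irrelevant document are never queued.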

2. (Original) The method of claim 1, further comprising seeding a plurality of
seed uniform resource locator strings to start the collaborative focused 
crawling of the documents. 

3. (Original) The method of claim 2, further comprising crawling the seed
uniform resource locator strings. 

4. (Original) The method of claim 3, further comprising writing a plurality of 
resulting uniform resource locator strings obtained by crawling the seed 
uniform resource locator strings. 

5. (Original) The method of claim 4, further comprising a foreman function 
for reading a plurality of contents of the resulting uniform resource locator 
strings. 

6. (Original) The method of claim 5, further comprising the foreman 
function passing the contents of the resulting uniform resource locator strings to 
a miner. 

7. (Original) The method of claim 6, further comprising the miner instructing
a fetcher to crawl a plurality of out-links on a document of the resulting
resource locator string when the contents of the resulting resource locator
string match a focus topic of the miner.

8. (Original) The method of claim 6, further comprising the miner ignoring 
resulting resource locator string when the contents of the resulting resource 
locator string do not match the focus of the miner. 

9. (Original) The method of claim 6, further comprising the miner managing 
a plurality of focus topics. 

10. (Original) The method of claim 9, further comprising the miner allowing a
crawling of the resulting resource locator string when the resulting resource 
locator string matches a plurality of web space rules. 

11. (Currently amended) The method of claim [[9]] 10, wherein the web
space rules comprise domain rules, IP address rules, and prefix rules. 
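
As an illustration of the three rule kinds named in claim 11, the following is a minimal sketch, assuming that a URL is allowed when it satisfies any one rule (the claims do not say whether rules combine as "any" or "all"). The `WebSpaceRules` class, its field names, and the caller-supplied `resolved_ip` argument are hypothetical.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple
from urllib.parse import urlsplit


@dataclass
class WebSpaceRules:
    """Hypothetical container for domain, IP address, and prefix rules."""

    domains: Set[str] = field(default_factory=set)
    networks: List[ipaddress.IPv4Network] = field(default_factory=list)
    prefixes: Tuple[str, ...] = ()

    def allows(self, url: str, resolved_ip: Optional[str] = None) -> bool:
        # Domain rule: host equals a listed domain or is a subdomain of it.
        host = (urlsplit(url).hostname or "").lower()
        if any(host == d or host.endswith("." + d) for d in self.domains):
            return True

        # IP address rule: the caller supplies the resolved address, so the
        # sketch performs no DNS lookup of its own.
        if resolved_ip is not None:
            addr = ipaddress.ip_address(resolved_ip)
            if any(addr in net for net in self.networks):
                return True

        # Prefix rule: the URL string starts with a listed prefix.
        return url.startswith(self.prefixes)
```

A miner in this sketch would call `rules.allows(url, resolved_ip)` before instructing the fetcher to crawl a resulting resource locator string.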

12. (Currently amended) The method of claim [[9]] 10, further comprising
the miner disallowing the crawling of the resulting resource locator string when 
the content of the resulting resource locator string matches a focus topic of 
the miner. 

13. (Currently amended) The method of claim [[6]] 10, wherein the miner
comprises an unfocus miner that places the resulting uniform resource locator
strings that match an unfocus topic in [[a]] the blacklist, so that the uniform
resource locator strings will not be crawled again. 

14. (Currently amended) A computer program product having a plurality of
executable instruction codes stored on a computer readable storage medium, 
for implementing a collaborative focused crawling of documents related to 
multiple focus topics on a network, the computer program product comprising: 

a first set of instruction codes for selectively prioritizing the documents to 
crawl based on a set of rules; 

a second set of instruction codes for fetching prioritized documents from 
the network; 

for each fetched document, a third set of instruction codes determines 
whether the fetched document is relevant to any of the multiple focus topics; 
a fourth set of instruction codes for crawling the fetched document that 
matches any of the multiple focus topics; [[and]]

wherein the fourth set of instruction codes further crawls out-links on the
fetched document based on an assumption that if the fetched document is of
interest, the out-links are also of interest;

wherein the fourth set of instruction codes further determines whether 
the fetched document should be disallowed, and upon determination that the 
fetched document should be disallowed: 

selectively disallowing the fetched document; 

identifying a resource locator string associated with the fetched
document; and

placing the resource locator string for the fetched document in a
blacklist in order to prevent future crawling of the fetched document.

15. (Original) The computer program product of claim 14, further comprising
a fifth set of instruction codes for seeding a plurality of seed uniform resource 
locator strings to start the collaborative focused crawling of the documents. 

16. (Original) The computer program product of claim 15, wherein the fourth
set of instruction codes further crawls the seed uniform resource locator strings. 

17. (Original) The computer program product of claim 16, further comprising
a sixth set of instruction codes for writing a plurality of resulting uniform resource 
locator strings obtained by crawling the seed uniform resource locator strings. 

18. (Currently amended) A processor-implemented system for implementing
a collaborative focused crawling of documents related to multiple focus topics 
on a network, the system comprising: 

an evaluator that selectively prioritizes the documents to crawl based on 
a set of rules; 

a fetcher that fetches prioritized documents from the network; 

for each fetched document, a focus engine determines whether the 
fetched document is relevant to any of the multiple focus topics; 

a crawler for crawling the fetched document that matches any of the 
multiple focus topics; [[and]]

wherein the crawler further crawls out-links on the fetched document 
based on an assumption that if the fetched document is of interest, the out- 
links are also of interest; 

wherein the crawler further determines whether the fetched document 
should be disallowed, and upon determination that the fetched document 
should be disallowed: 

selectively disallowing the fetched document; 

identifying a resource locator string associated with the fetched
document; and

placing the resource locator string for the fetched document in a
blacklist in order to prevent future crawling of the fetched document.

19. (Currently amended) The system of claim [[14]] 18, further comprising a
plurality of seed uniform resource locator strings that are used to initiate the 
collaborative focused crawling of the documents. 

20. (Currently amended) The system of claim [[15]] 19, wherein the crawler
further crawls the seed uniform resource locator strings. 